{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a href=\"https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/node_postprocessor/CohereRerank.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Cohere Rerank"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙 and the Cohere Rerank postprocessor package."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install llama-index-postprocessor-cohere-rerank"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install llama-index"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
   ],
   "source": [
    "from llama_index.core import (\n",
    "    VectorStoreIndex,\n",
    "    SimpleDirectoryReader,\n",
    "    pprint_response,\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Download the Paul Graham essay used as sample data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!mkdir -p 'data/paul_graham/'\n",
    "!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# load documents\n",
    "documents = SimpleDirectoryReader(\"./data/paul_graham/\").load_data()\n",
    "\n",
    "# build index\n",
    "index = VectorStoreIndex.from_documents(documents=documents)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Retrieve the top 10 most relevant nodes, then rerank with Cohere Rerank and keep the top 2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "from llama_index.postprocessor.cohere_rerank import CohereRerank\n",
    "\n",
    "\n",
    "api_key = os.environ[\"COHERE_API_KEY\"]\n",
    "cohere_rerank = CohereRerank(api_key=api_key, top_n=2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "query_engine = index.as_query_engine(\n",
    "    similarity_top_k=10,\n",
    "    node_postprocessors=[cohere_rerank],\n",
    ")\n",
    "response = query_engine.query(\n",
    "    \"What did Sam Altman do in this essay?\",\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Final Response: Sam Altman agreed to become the president of Y\n",
      "Combinator in October 2013. He took over starting with the winter 2014\n",
      "batch, and worked with the founders to help them get through Demo Day\n",
      "in March 2014. He then reorganised Y Combinator to be controlled by\n",
      "someone other than the founders, so that it could last for a long\n",
      "time.\n",
      "______________________________________________________________________\n",
      "Source Node 1/2\n",
      "Document ID: c1baaa76-acba-453b-a8d1-fdffbde1f424\n",
      "Similarity: 0.845305\n",
      "Text: day in 2010, when he was visiting California for interviews,\n",
      "Robert Morris did something astonishing: he offered me unsolicited\n",
      "advice. I can only remember him doing that once before. One day at\n",
      "Viaweb, when I was bent over double from a kidney stone, he suggested\n",
      "that it would be a good idea for him to take me to the hospital. That\n",
      "was what it ...\n",
      "______________________________________________________________________\n",
      "Source Node 2/2\n",
      "Document ID: abc0f1aa-464a-4ae1-9a7b-2d47a9dc967e\n",
      "Similarity: 0.6486889\n",
      "Text: due to our ignorance about investing. We needed to get\n",
      "experience as investors. What better way, we thought, than to fund a\n",
      "whole bunch of startups at once? We knew undergrads got temporary jobs\n",
      "at tech companies during the summer. Why not organize a summer program\n",
      "where they'd start startups instead? We wouldn't feel guilty for being\n",
      "in a sense...\n"
     ]
    }
   ],
   "source": [
    "pprint_response(response)"
   ]
  },
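  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see the two stages separately, you can run the retriever and the reranker by hand. The cell below is a minimal sketch, not part of the original example: it assumes the `index` and `cohere_rerank` objects defined above and a valid `COHERE_API_KEY`, and the names `query`, `candidates`, and `reranked` are illustrative. It fetches 10 candidates by embedding similarity, passes them through `postprocess_nodes`, and prints the node ID and score of each node that survives reranking."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative sketch: assumes `index` and `cohere_rerank` from the cells above.\n",
    "query = \"What did Sam Altman do in this essay?\"\n",
    "\n",
    "# Stage 1: embedding-based retrieval of 10 candidate nodes.\n",
    "retriever = index.as_retriever(similarity_top_k=10)\n",
    "candidates = retriever.retrieve(query)\n",
    "\n",
    "# Stage 2: rerank the candidates with Cohere; top_n=2 was set on the reranker.\n",
    "reranked = cohere_rerank.postprocess_nodes(candidates, query_str=query)\n",
    "\n",
    "# Print the ID and score of each node kept after reranking.\n",
    "for node_with_score in reranked:\n",
    "    print(node_with_score.node.node_id, node_with_score.score)"
   ]
  },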
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Directly retrieve the top 2 most similar nodes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "query_engine = index.as_query_engine(\n",
    "    similarity_top_k=2,\n",
    ")\n",
    "response = query_engine.query(\n",
    "    \"What did Sam Altman do in this essay?\",\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Without reranking, the retrieved context is irrelevant to the question and the response is hallucinated."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Final Response: Sam Altman was one of the founders of Y Combinator, a\n",
      "startup accelerator. He was part of the first batch of startups funded\n",
      "by Y Combinator, which included Reddit, Justin Kan and Emmett Shear's\n",
      "Twitch, and Aaron Swartz. He was also involved in the Summer Founders\n",
      "Program, which was a summer program where undergrads could start their\n",
      "own startups instead of taking a summer job at a tech company. He also\n",
      "helped to develop a new version of Arc, a programming language, and\n",
      "wrote a book on Lisp.\n",
      "______________________________________________________________________\n",
      "Source Node 1/2\n",
      "Document ID: abc0f1aa-464a-4ae1-9a7b-2d47a9dc967e\n",
      "Similarity: 0.7940524933077708\n",
      "Text: due to our ignorance about investing. We needed to get\n",
      "experience as investors. What better way, we thought, than to fund a\n",
      "whole bunch of startups at once? We knew undergrads got temporary jobs\n",
      "at tech companies during the summer. Why not organize a summer program\n",
      "where they'd start startups instead? We wouldn't feel guilty for being\n",
      "in a sense...\n",
      "______________________________________________________________________\n",
      "Source Node 2/2\n",
      "Document ID: 5d696e20-b496-47f0-9262-7aa2667c1d96\n",
      "Similarity: 0.7899270712205545\n",
      "Text: at RISD, but otherwise I was basically teaching myself to paint,\n",
      "and I could do that for free. So in 1993 I dropped out. I hung around\n",
      "Providence for a bit, and then my college friend Nancy Parmet did me a\n",
      "big favor. A rent-controlled apartment in a building her mother owned\n",
      "in New York was becoming vacant. Did I want it? It wasn't much more\n",
      "tha...\n"
     ]
    }
   ],
   "source": [
    "pprint_response(response)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
